Fix concurrent tool installation race conditions with named mutex #51834
base: main
Co-authored-by: baronfel <[email protected]>
if (!mutex.WaitOne(TimeSpan.FromMinutes(5)))
{
    throw new ToolPackageException(string.Format(CliStrings.ToolInstallationTimeout, packageId, packageVersion));
}
@copilot Do two things here to make the user experience better:
- Do a short initial WaitOne with a very short TimeSpan, like 50ms, to see if the mutex is free. If it is not, emit a message on stderr using the repo's existing patterns, letting the user know that another install of the same package is executing and that they can press Ctrl+C to cancel this one. Then do the longer-lived WaitOne for the extended duration.
- Ensure that cancellation is supported correctly throughout this operation. System.CommandLine provides a CancellationToken on its InvocationContext, which is passed to commands, and that CancellationToken is bound to Ctrl+C, so we should be able to listen to it to early-terminate the
Implemented the two-stage mutex acquisition with user feedback in commit e779d23:
- Initial quick check (50ms): Attempts to acquire the mutex immediately. If successful, proceeds without delay.
- User notification: If the mutex is held by another process, displays on stderr: "Another installation of tool '{packageId}' version '{version}' is in progress. Waiting for it to complete... (Press Ctrl+C to cancel)"
- Extended wait: Then waits up to 5 minutes for the mutex.
Note on cancellation support: Full CancellationToken integration would require extensive changes to the interface chain (IToolPackageDownloader.InstallPackage → ToolPackageDownloaderBase.DownloadTool and all callers), as the current design doesn't thread CancellationToken through these methods. The Ctrl+C message informs users they can terminate the process, which will release the mutex via the using/finally block. Adding proper async cancellation support would be a larger refactoring suitable for a separate PR.
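If a CancellationToken were threaded through to the download path, the two-stage wait could observe it without any async refactoring. The following is a minimal sketch under that assumption. `MutexWaitSketch`, `TryAcquire`, the `onContended` callback and the 100 ms poll interval are hypothetical names and values for illustration and are not part of the PR. It polls with short `WaitOne` slices instead of combining the mutex with `token.WaitHandle` in `WaitHandle.WaitAny`, because multi-handle waits that include a named mutex may not be supported on every platform.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class MutexWaitSketch
{
    // Hypothetical helper: returns true if the calling thread now owns `mutex`,
    // false if `longWait` elapsed first. Throws OperationCanceledException if
    // `token` is cancelled while waiting.
    public static bool TryAcquire(
        Mutex mutex, TimeSpan quickWait, TimeSpan longWait,
        Action onContended, CancellationToken token)
    {
        try
        {
            // Stage 1: short probe so the uncontended path prints nothing.
            if (mutex.WaitOne(quickWait))
            {
                return true;
            }

            // Stage 2: notify the caller once, then wait in short slices so
            // that cancellation is noticed between slices.
            onContended();
            TimeSpan pollInterval = TimeSpan.FromMilliseconds(100); // assumed value
            Stopwatch elapsed = Stopwatch.StartNew();
            while (true)
            {
                token.ThrowIfCancellationRequested();
                TimeSpan remaining = longWait - elapsed.Elapsed;
                if (remaining <= TimeSpan.Zero)
                {
                    return false;
                }
                TimeSpan slice = remaining < pollInterval ? remaining : pollInterval;
                if (mutex.WaitOne(slice))
                {
                    return true;
                }
            }
        }
        catch (AbandonedMutexException)
        {
            // A previous owner exited without releasing the mutex;
            // the calling thread owns it now.
            return true;
        }
    }
}
```

A caller would pass something like `() => Reporter.Error.WriteLine(...)` as `onContended` and throw the timeout exception when the method returns false. Polling trades a small cancellation latency (at most one slice) for portability.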
- Add 50ms initial WaitOne to quickly check if mutex is available
- Display informative message when waiting for another installation
- Tell users they can press Ctrl+C to cancel
- Maintains 5-minute total timeout for full wait
Co-authored-by: baronfel <[email protected]>
Pull Request Overview
This pull request implements concurrent tool installation protection using named mutexes to prevent "file being used by another process" errors when multiple dotnet CLI processes attempt to install the same .NET tool simultaneously.
Key changes:
- Adds named mutex synchronization (tool-install-{packageId}-{packageVersion}) to serialize concurrent installations per package/version combination
- Implements two-stage mutex acquisition with user notification for better UX
- Adds comprehensive test coverage for concurrent installation scenarios
Reviewed Changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/Cli/dotnet/ToolPackage/ToolPackageDownloaderBase.cs | Core implementation: wraps DownloadTool method with named mutex, includes timeout handling and mutex name generation |
| src/Cli/dotnet/CliStrings.resx | Adds two new error messages: ToolInstallationTimeout and ToolInstallationWaiting for user communication |
| src/Cli/dotnet/xlf/*.xlf | Adds localization entries for the new error messages across 13 language files (all marked as state="new") |
| test/Microsoft.DotNet.PackageInstall.Tests/ToolPackageDownloaderTests.cs | Adds GivenConcurrentInstallationsTheyDoNotConflict test with concurrent Task.Run executions; includes minor formatting fixes (alphabetized using statements, whitespace cleanup) |
// First try a quick check to see if the mutex is immediately available
if (!mutex.WaitOne(TimeSpan.FromMilliseconds(50)))
{
    // Mutex is held by another process - inform the user
    Reporter.Error.WriteLine(string.Format(CliStrings.ToolInstallationWaiting, packageId, packageVersion));

    // Now wait for the longer duration
    if (!mutex.WaitOne(TimeSpan.FromMinutes(5)))
    {
        throw new ToolPackageException(string.Format(CliStrings.ToolInstallationTimeout, packageId, packageVersion));
    }
}

try
{
    DownloadAndExtractPackage(packageId, nugetPackageDownloader, packageDownloadDir.Value, packageVersion, packageSourceLocation, includeUnlisted: givenSpecificVersion, verbosity: verbosity);
}

CreateAssetFile(packageId, packageVersion, packageDownloadDir, Path.Combine(assetFileDirectory.Value, ToolPackageInstance.AssetsFileName), _runtimeJsonPath, verbosity, targetFramework);
Copilot AI, Nov 20, 2025
The mutex acquisition logic should handle AbandonedMutexException, which can be thrown when a previous process holding the mutex terminated abnormally without releasing it. When this exception is caught, the calling thread has acquired the mutex and can safely proceed with the installation. Consider wrapping the WaitOne calls in a try-catch block that handles AbandonedMutexException.
Example:
try
{
    if (!mutex.WaitOne(TimeSpan.FromMilliseconds(50)))
    {
        // ... existing code ...
    }
}
catch (AbandonedMutexException)
{
    // Mutex was abandoned by another process, but we now own it
    // This is safe to proceed
}

finally
{
    mutex.ReleaseMutex();
}
Copilot AI, Nov 20, 2025
The ReleaseMutex() call in the finally block can throw ApplicationException if the mutex is not currently owned by the calling thread (e.g., if an exception was thrown before the mutex was acquired, or if an AbandonedMutexException occurred but wasn't properly caught). This could mask the original exception. Consider checking if the mutex was successfully acquired before attempting to release it.
Example:
bool mutexAcquired = false;
try
{
    if (!mutex.WaitOne(TimeSpan.FromMilliseconds(50)))
    {
        // ...
        mutexAcquired = mutex.WaitOne(TimeSpan.FromMinutes(5));
    }
    else
    {
        mutexAcquired = true;
    }
    if (mutexAcquired)
    {
        // ... installation logic ...
    }
}
finally
{
    if (mutexAcquired)
    {
        mutex.ReleaseMutex();
    }
}

private static string GetToolInstallMutexName(PackageId packageId, NuGetVersion packageVersion)
{
    // Create a mutex name in the format: tool-install-{packageId}-{packageVersion}
    // Replace characters that are invalid in mutex names with underscores
    string safeName = $"tool-install-{packageId}-{packageVersion.ToNormalizedString()}"
        .Replace('/', '_')
        .Replace('\\', '_');

    return safeName;
}
Copilot AI, Nov 20, 2025
The mutex name sanitization only replaces / and \ characters, but mutex names on Windows have additional restrictions. Windows mutex names:
- Cannot exceed 260 characters
- Cannot contain certain special characters beyond / and \
Package IDs and versions could contain other problematic characters (e.g., + in semver build metadata). Consider using a more robust sanitization approach or creating a hash-based name for long/complex package identifiers.
Example:
private static string GetToolInstallMutexName(PackageId packageId, NuGetVersion packageVersion)
{
    string baseName = $"tool-install-{packageId}-{packageVersion.ToNormalizedString()}";
    // If the name is too long or contains problematic characters, use a hash
    if (baseName.Length > 200 || !IsValidMutexName(baseName))
    {
        using var sha256 = SHA256.Create();
        var hash = Convert.ToBase64String(sha256.ComputeHash(Encoding.UTF8.GetBytes(baseName)))
            .Replace('/', '_')
            .Replace('+', '-');
        return $"tool-install-{hash}";
    }
    return baseName.Replace('/', '_').Replace('\\', '_');
}

if (!mutex.WaitOne(TimeSpan.FromMilliseconds(50)))
{
    // Mutex is held by another process - inform the user
    Reporter.Error.WriteLine(string.Format(CliStrings.ToolInstallationWaiting, packageId, packageVersion));
Copilot AI, Nov 20, 2025
The waiting message is written to Reporter.Error, but this is informational rather than an error. According to the PR description, this message is meant to provide "transparency about what's happening." Consider using Reporter.Output instead, as this is normal operational information, not an error condition.
Reporter.Output.WriteLine(string.Format(CliStrings.ToolInstallationWaiting, packageId, packageVersion));

Suggested change:
- Reporter.Error.WriteLine(string.Format(CliStrings.ToolInstallationWaiting, packageId, packageVersion));
+ Reporter.Output.WriteLine(string.Format(CliStrings.ToolInstallationWaiting, packageId, packageVersion));

// Use a named mutex to serialize concurrent installations of the same tool package
string mutexName = GetToolInstallMutexName(packageId, packageVersion);
using var mutex = new Mutex(false, mutexName);
I worry about the scenario where a computer running Windows has two Jenkins agents installed as services with separate user accounts and separate file-system directories, and the agents run dotnet tool install on the same tool in parallel. Then this code will construct the same mutexName string in both processes, and because all services run in session 0, they will attempt to open the same mutex object; but because they have separate user accounts, the DACL of the mutex might not allow the second open.
So I'd suggest including a hash of the directory path where the tool is going to be installed.
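One way to read this suggestion is to fold the install directory into the mutex name, so that installations into different directories never share a mutex object. The sketch below assumes .NET 5 or later (for `SHA256.HashData` and `Convert.ToHexString`) and takes plain strings rather than the PR's `PackageId` and `NuGetVersion` types. `MutexNameSketch` and `GetName` are hypothetical names, and hashing the whole key (rather than only the directory) is a design choice of this sketch, not something the reviewers specified.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class MutexNameSketch
{
    // Hypothetical helper: derives a fixed-length mutex name from the package id,
    // the normalized version string and the directory the tool is installed into.
    public static string GetName(string packageId, string normalizedVersion, string installDirectory)
    {
        // Package IDs are treated as case-insensitive here, so normalize the case.
        // The directory is made absolute but its case is left untouched.
        string key = string.Join(
            "|",
            packageId.ToLowerInvariant(),
            normalizedVersion,
            Path.GetFullPath(installDirectory));

        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(key));

        // Hex output contains only 0-9 and A-F, so no further sanitization is needed.
        return "tool-install-" + Convert.ToHexString(digest);
    }
}
```

The resulting name is always 77 characters long (a 13-character prefix plus 64 hex digits) and contains no path separators or `+` characters, which also sidesteps the length and character concerns raised in the earlier comment. The trade-off is that the name is no longer human-readable when inspecting handles.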
Concurrent installations of the same .NET tool fail with "file being used by another process" errors when multiple dotnet CLI processes attempt to download/extract packages simultaneously to the shared NuGet cache.
Changes
- ToolPackageDownloaderBase.cs: Wrap DownloadTool with a named mutex tool-install-{packageId}-{packageVersion} to serialize concurrent installations per package version. Mutex has a 5-minute timeout with clear error messaging. Implements two-stage acquisition: quick 50ms check followed by user notification if waiting is required.
- CliStrings.resx: Add ToolInstallationTimeout error message for mutex timeout scenarios and ToolInstallationWaiting message to inform users when waiting for concurrent installations.
- ToolPackageDownloaderTests.cs: Add GivenConcurrentInstallationsTheyDoNotConflict test verifying no file access conflicts occur during concurrent installations.
Scope
The mutex protects the critical section from package existence check through download, extraction, and asset file creation. Tool execution remains outside the mutex. RID-specific packages are also protected.
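The shape of such a critical section, combining the review feedback about abandoned mutexes and about releasing only an owned mutex, might look like the following sketch. `ExclusiveSectionSketch` and `Run` are hypothetical names, and `TimeoutException` stands in for the PR's `ToolPackageException`. The method takes a `Mutex` instance so that a caller can pass a named one, for example `new Mutex(false, mutexName)`.

```csharp
using System;
using System.Threading;

public static class ExclusiveSectionSketch
{
    // Hypothetical wrapper: runs `criticalSection` while holding `mutex`,
    // and releases the mutex only if this thread actually acquired it.
    public static void Run(Mutex mutex, TimeSpan timeout, Action criticalSection)
    {
        bool acquired = false;
        try
        {
            try
            {
                acquired = mutex.WaitOne(timeout);
            }
            catch (AbandonedMutexException)
            {
                // The previous owner exited without releasing;
                // ownership has passed to this thread.
                acquired = true;
            }

            if (!acquired)
            {
                // Stand-in for the PR's ToolPackageException.
                throw new TimeoutException("Timed out waiting for another installation.");
            }

            // Existence check, download, extraction and asset file creation go here.
            criticalSection();
        }
        finally
        {
            // ReleaseMutex throws ApplicationException when called by a non-owner,
            // which would mask the original exception, so guard it.
            if (acquired)
            {
                mutex.ReleaseMutex();
            }
        }
    }
}
```

Because the release is guarded by the `acquired` flag, a timeout surfaces as the timeout exception itself rather than as an `ApplicationException` from the `finally` block.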
User Experience
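When a concurrent installation is detected, users are immediately informed with:
Another installation of tool '{packageId}' version '{version}' is in progress. Waiting for it to complete... (Press Ctrl+C to cancel)
This provides transparency about what's happening and allows users to cancel if desired.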
Fixes #51831